CLAIM AMENDMENTS



1.-7. (canceled)



8. (previously presented) A method of animating a synthesized model of a human face driven by an audio driving signal, comprising an analytic phase, in which an alphabet of low level visemes is determined, and a synthesis phase, in which the audio driving signal is converted into a sequence of low level visemes applied to a model, wherein said analytic phase comprises the steps of

extracting both a set of information representing a shape of a speaker's face and corresponding sequences of phonetic units from a set of audio training signals;

compressing said set of information into active shape model parameter vectors representative of phonetic units;

associating to said active shape model parameter vectors representative of phonetic units an interpolation function to provide a continuous representation of movement between phonemes, wherein said interpolation function is a convex combination having combination coefficients variable as a continuous function of time whereby said association determines said alphabet of low level visemes;

associating low level parameters of facial animation, compliant with Standard ISO/IEC 14496 VER. 1, to said low level visemes;

wherein said synthesis phase comprises the steps of

extracting a sequence of phonetic units of an audio driving signal;

associating to said sequence of phonetic units extracted in said synthesis phase a corresponding sequence of low level visemes as determined in the analytic phase;

transforming said sequence of low level visemes of said synthesis phase through an interpolation function to provide a continuous representation of movement between phonemes, wherein said interpolation function of said synthesis phase is a convex combination having combination coefficients variable as a continuous function of time; and

wherein the combination coefficients carried out in the synthesis phase are the same as those used in the analytic phase.
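
Editorial illustration (not part of the claims): the synthesis steps of claim 8 can be sketched as a table lookup followed by a convex blend of neighbouring viseme vectors. The sketch below is a minimal example under stated assumptions. The raised-cosine weight, the names `blend_weights`, `interpolate` and `synthesize`, and the two-component viseme vectors are all hypothetical; the coefficient functions actually recited in the dependent claims are not legible in this copy and are not reproduced here.

```python
import bisect
import math


def blend_weights(t, t_prev, t_next):
    """Return (w_prev, w_next) for an instant t between two utterance instants.

    Both weights are non-negative and sum to 1, so the blend is a convex
    combination. The raised-cosine shape is an assumption for illustration.
    """
    if not t_prev < t_next:
        raise ValueError("utterance instants must be strictly increasing")
    u = min(max((t - t_prev) / (t_next - t_prev), 0.0), 1.0)
    w_next = 0.5 * (1.0 - math.cos(math.pi * u))
    return 1.0 - w_next, w_next


def interpolate(times, vectors, t):
    """Blend the two viseme vectors whose utterance instants bracket t."""
    if not times or len(times) != len(vectors):
        raise ValueError("need one viseme vector per utterance instant")
    if t <= times[0]:
        return list(vectors[0])
    if t >= times[-1]:
        return list(vectors[-1])
    k = bisect.bisect_right(times, t) - 1  # times[k] <= t < times[k + 1]
    w_prev, w_next = blend_weights(t, times[k], times[k + 1])
    return [w_prev * a + w_next * b for a, b in zip(vectors[k], vectors[k + 1])]


def synthesize(alphabet, timed_phonemes, t):
    """Map (instant, phoneme) pairs to viseme vectors, then interpolate at t.

    `alphabet` is a hypothetical dict from phoneme label to viseme vector.
    """
    times = [instant for instant, _ in timed_phonemes]
    vectors = [alphabet[phoneme] for _, phoneme in timed_phonemes]
    return interpolate(times, vectors, t)
```

Because the same `blend_weights` function would serve both when the alphabet is built and when it is played back, the sketch also mirrors the final limitation of claim 8, under which the synthesis-phase coefficients are the same as those of the analytic phase.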

9. (previously presented) The method according to claim 8, wherein the combination coefficients β_n(t) of said convex combinations are functions of the following type:

where t_n is the instant of utterance of the nth phonetic units.

10. (previously presented) The method according to claim 9 wherein the wire-frame vertices, corresponding to model feature points, on the basis of which facial animation parameters are determined in the analytic phase, are identified and said low-level viseme interpolation operations are conducted by applying transforms on feature points for each low-level viseme, for animating a wire-frame based model.

11. (previously presented) The method according to claim 10 wherein for each position to be assumed by the model in said synthesis phase, the transforms are applied only to the vertices of the wire-frame corresponding to the feature points and the transforms are extended to remaining vertices by means of a convex combination of the transforms applied to the vertices of the wire-frame corresponding to the feature points.
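
Editorial illustration (not part of the claims): claim 11 extends the transforms from feature-point vertices to the remaining wire-frame vertices through a convex combination, but the weights are not specified in this text. The sketch below assumes normalized inverse-distance weights, which is one possible choice only. The names `convex_weights` and `extend_displacements`, and the modelling of a transform as a displacement vector, are hypothetical.

```python
import math


def convex_weights(vertex, feature_points, eps=1e-9):
    """Non-negative weights summing to 1, one per feature point.

    Assumption: weights are proportional to inverse distance. A vertex that
    coincides with a feature point takes that point's transform unchanged.
    """
    dists = [math.dist(vertex, fp) for fp in feature_points]
    for i, d in enumerate(dists):
        if d < eps:
            return [1.0 if j == i else 0.0 for j in range(len(dists))]
    inverse = [1.0 / d for d in dists]
    total = sum(inverse)
    return [w / total for w in inverse]


def extend_displacements(vertices, feature_points, feature_displacements):
    """Displacement of each vertex as a convex blend of feature displacements."""
    dim = len(feature_displacements[0])
    result = []
    for vertex in vertices:
        weights = convex_weights(vertex, feature_points)
        result.append([
            sum(w * disp[k] for w, disp in zip(weights, feature_displacements))
            for k in range(dim)
        ])
    return result
```

A vertex midway between two feature points therefore moves by the average of their displacements, and no vertex can move further than the largest feature-point displacement, which is the practical benefit of keeping the combination convex.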

12. (previously presented) The method according to claim 8 wherein said low-level visemes are converted into co-ordinates of the feature points of the face of the speaker, followed by conversion of said co-ordinates into low-level facial animation parameters compliant with Standard ISO/IEC 14496 VER. 1.

13. (previously presented) The method according to claim 12 wherein said low-level facial animation parameters, representing the co-ordinates of feature points, are obtained in the analytic phase by analyzing movements of a set of markers which identify the feature points.

14. (currently amended) The method according to claim 13 wherein data representing the co-ordinates of the feature points of the face are normalized according to the following method:

a sub-set of markers are associated to a stiff object applied to the forehead of the speaker;

the face of the speaker is set, at the beginning of the recording, to assume a position corresponding as far as possible to the position of a neutral face model, as defined in standard ISO/IEC 14496 VER. 1, and a first frame of the face in such neutral position is obtained; and

for all frames subsequent to the first frame, the sets of co-ordinates are rotated and translated so that the co-ordinates corresponding to the markers of said sub-set coincide with the co-ordinates of the markers of the same sub-set in the first frame.
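
Editorial illustration (not part of the claims): the rotate-and-translate normalization of claim 14 can be sketched by building an orthonormal frame from the stiff-object markers in each recording frame and re-expressing every point in the frame of the first recording frame. The sketch assumes exactly three non-collinear reference markers in 3D and noise-free, truly rigid marker positions; with more markers or measurement noise a least-squares rigid fit would be the more usual choice. All function names are hypothetical.

```python
import math


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _unit(a):
    norm = math.sqrt(_dot(a, a))
    if norm == 0.0:
        raise ValueError("reference markers must be distinct and non-collinear")
    return [x / norm for x in a]


def rigid_frame(m0, m1, m2):
    """Origin and right-handed orthonormal axes attached to three markers."""
    e1 = _unit(_sub(m1, m0))
    e3 = _unit(_cross(e1, _sub(m2, m0)))
    e2 = _cross(e3, e1)
    return m0, (e1, e2, e3)


def normalize_frame(points, ref_markers, cur_markers):
    """Rotate and translate `points` so cur_markers land on ref_markers.

    ref_markers: the three stiff-object markers in the first (neutral) frame.
    cur_markers: the same three markers in the frame being normalized.
    """
    o_ref, axes_ref = rigid_frame(*ref_markers)
    o_cur, axes_cur = rigid_frame(*cur_markers)
    normalized = []
    for p in points:
        offset = _sub(p, o_cur)
        local = [_dot(offset, axis) for axis in axes_cur]  # head-fixed coords
        normalized.append([
            o_ref[k] + sum(c * axis[k] for c, axis in zip(local, axes_ref))
            for k in range(3)
        ])
    return normalized
```

After this step, any remaining motion of the other markers is due to facial deformation rather than to head movement, which is the purpose of the normalization.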

15. (currently amended) A method of generating an alphabet of low level visemes for animating a synthesized model of a human face driven by an audio signal, comprising the steps of

extracting both a set of information representing the shape of a speaker's face and corresponding sequences of phonetic units from a set of audio training signals;

compressing said set of information into active shape model (ASM) parameter vectors; and

associating to said active shape model (ASM) parameter vectors representative of phonetic units an interpolation function to provide a continuous representation of movement between phonemes, wherein said interpolation function is a convex combination having combination coefficients variable as a continuous function of time whereby said association determines said alphabet of low level visemes.



16. (previously presented) The method according to claim 15 wherein the combination coefficients β_n(t) of said convex combinations are functions of the following type:



β_n(t) = [formula illegible in the source; the surviving fragments are "cos", "cos", "3" and "0"]



where t_n is the instant of utterance of the nth phonetic units,